Resampling Statistics Tutorial
75th  MORS  Symposium  USNA 


Presented  by: 

Greg  Hutto 

Chief  Operations  Analyst 
53rd Wing

Air  Combat  Command  USAF 
gregory.hutto@eglin.af.mil


Tutorial  Overview 


Baron  Von  Munchhausen 


□  A  couple  of  motivational  problems 


□  Why  we  care  &  Resampling  History 

□  Resampling  vs.  Classical  Approach  to  Statistical 
Testing 

■ The one sample bootstrap (binomial p to 1 sample t)

■  The  two  sample  bootstrap  &  shuffle  (two  sample  t) 

■  One  sample  -  sample  size  (noncentral  t  power) 

■ 1 Way ANOVA (F test)

■  Simple  linear  regression 

■  2  Way  ANOVA 

■  Medians,  correlations,  percentiles 

□  Potential  Advantages  of  Adopting  Resampling 

■  Teaching 

■  Dealing  with  missing  data 

■  Estimating  sample  size 

■  Nonnormal  data 

■  Getting  the  "right"  answer  in  Analysis 

□  Caveats  &  Directions  for  the  Future 
Why we care about resampling

□ It's all about test productivity: how many good tests can you run with $48 million and 2200 people?

□ As of Spring 2002, the 53rd Wing has changed its method of test to Design of Experiments (DOE)

□  We  have  350+  project  officers  and  70 
analysts  to  teach  on  a  rotating  basis 

□  More  than  20  geographically  separated 
units 

□  As  of  last  count,  256  test  projects  on  the 
ACC  Test  Priority  List  (March  2007) 

□  We  currently  service  less  than  75%  of 
these  each  year 

□ And ... post-Sept 11, the pace continues to quicken
Problems to motivate usefulness of resampling


□  Probability  of  1  boy  &  1  girl  in  a 
two  child  family? 

□  Given  that  one  child  is  a  boy  in  a 
two  child  family,  probability  the 
other  child  is  also  a  boy? 

□  JASSM  Reliability  is  low.  If  "fixed" 
and  now  80%  can  we  determine 
that  with  20  shots? 

□  Joint  Programmable  Fuze  is  90%+ 
reliable.  Or  is  it?  We  had  2  or  3 
of  6  failures. 

□ The SWEAT diet is being proposed to get our Airmen "Fit to Fight": will it work?
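
The first two problems can be answered by brute-force simulation before any formula is written down. A minimal sketch, assuming boys and girls are equally likely and independent, and reading "one child is a boy" as "at least one child is a boy" (both are assumptions, not stated on the slide):

```python
import random

def simulate_two_child(trials=100_000, seed=1):
    """Simulate two-child families and tally the two questions."""
    rng = random.Random(seed)
    one_each = 0          # families with exactly one boy and one girl
    at_least_one_boy = 0  # families with at least one boy
    both_boys = 0         # families where both children are boys
    for _ in range(trials):
        kids = (rng.choice("BG"), rng.choice("BG"))
        if kids.count("B") == 1:
            one_each += 1
        if "B" in kids:
            at_least_one_boy += 1
            if kids == ("B", "B"):
                both_boys += 1
    return one_each / trials, both_boys / at_least_one_boy

p_one_each, p_other_boy = simulate_two_child()
print(round(p_one_each, 2), round(p_other_boy, 2))
```

Under these assumptions the exact answers are 1/2 and 1/3; the simulated proportions should land close to them.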
Why Resample? Information gathering & data analysis

□ Answering a research question

□ Estimating an uncertain quantity

□ Determine sampling distributions of statistics whose distributions cannot or have not been mathematically approximated

□ To avoid doubtful assumptions about the data

□ In the case of permutation tests, to arrive at exact p-values

□ To use a conceptually simple, widely applicable general method
Caveat - Resampling is a work in progress in our Wing

□ Col (Dr) Pete Vandenbosch suggested we consider resampling in January 02

□ Used 2002 MORS Symposium as a forcing function to get to work on solving the problems of applying

□ Made some progress - Overall, a B+

■ Teaching - A

■ Nonnormal data - B+

■ Missing data - B-

■ Sample Size - A

■ Applying to DOE - B

□ More directions at the summary
Teaching Difficult Statistical Concepts via Resampling

[Figure: test design matrix with factors ECM (Wet, Dry), Offset, Clutter, Short/Long, Low, tracking mode (AutoTrk, ManTrk), and altitude (High Altitude, Low Altitude); cell values shown include 12.1, 13.6, 14.7, 12.2, 5.6, 3.4]



□ "Statistically Significant Difference"

□ Hypothesis testing

□ α and β errors

□ Confidence intervals

□ Sample size via OC curves

□ Random vs. causal variation

□ Regression coefficients


[Figures: Scatter Plot Matrix Demonstration Plot with Least Squares Polynomial Fit; Probability Density Function y=F(x,10,10); Probability Distribution Function p=iF(x,10,10)]

ANOVA; Var.: LOCERRPC; R-sqr=.37782; Adj:.228
2**(3-0) design; MS Residual=10450.88
DV: LOCERRPC

Factor       SS          df   MS         F          p
(1)THREAT     39466.7     1   39466.67   3.776398   .063321
(2)BOUND       7602.1     1    7602.08    .727411   .401821
(3)JINK       32864.8     1   32864.76   3.144689   .088359
1 by 2           61.6     1      61.56    .005890   .939436
1 by 3        78618.9     1   78618.86   7.522705   .011100
2 by 3           45.0     1      45.02    .004307   .948194
Error        261271.9    25   10450.88
Total SS     419930.8    31
Resampling crystallizes thinking

□ Why must α be set to compute β error?

□ What is the root problem behind missing data and unbalanced designs?

□ What assumptions are we making about the sample when we apply the t test? Are they appropriate?

□ What are the real effects of outliers on linear models?

□ What exactly are my null and alternate worlds?
Definitions


Bootstrap:  Statistics  done  under  the  assumption  that 
the  population  distribution  is  identical  to  the  sample 
distribution.  The  data  are  sampled  with  replacement, 
simulating  an  infinite  population  from  the  sample. 

Permutation  (Shuffling)  Methods:  The  basis  for 
Fisher’s  Exact  test  and  the  Tea  Experiment.  The  gold 
standard  of  resampling  schemes  -  sampling  is  without 
replacement. 

Resampling: Statistics done under computation-intensive methods involving sampling from some distribution. These include bootstrap, jackknife, data shuffling schemes, and possibly others.

Monte  Carlo:  Simulations  done  using  random  (or 
pseudo-random)  numbers.  These  might  include 
resampling  schemes,  but  for  our  purposes  today,  we’ll 
distinguish  between  them. 
Historical Resampling Timeline (1)

[Timeline axis: 1650, 1700, 1850, 1900, 1925, 1950, 1975, 1980, 1990, 2000]

Gambling Experiments - Blaise Pascal
Probability Theory - Bernoulli

Student t Distribution - W.T. Gosset

The Tea Experiment - R.A. Fisher
Fisher’s Exact Test (permutations) - R.A. Fisher

Monte Carlo approaches (RAND et al.)
Quenouille’s (and later Tukey’s) Jackknife

Applications to Business and Economics - Julian Simon

Publication of Efron’s (Stanford) Article on Bootstrap

Peter Hall’s Publication of Asymptotic Theory of Resampling

1. Source: Chapter 1, Bootstrap Methods and Their Application, A. C. Davison and D. V. Hinkley, 1997
Is Bootstrap a Fraud?

“Much to my chagrin, I found myself at the bottom of the lake,” declared the Baron.

“But Baron, how did you save yourself from a watery fate?” exclaimed his lady listener.

“Why, I simply reached down,” the Baron explained, “and pulled myself up by my bootstraps.”


Source:  www.nnunchasuen.com 
Gosset Resampled to Generate the Student t Distribution

BIOMETRIKA, Volume VI, No. 1, March 1908.

THE PROBABLE ERROR OF A MEAN.

By Student.

Introduction.

Any experiment may be regarded as forming an individual of a “population” of experiments which might be performed under the same conditions. A series of experiments is a sample drawn from this population.

Now any series of experiments is only of value in so far as it enables us to form a judgment as to the ... great number of cases the question finally turns on the value of a mean, either directly, or as the mean difference between the two quantities.

□ Article originating the Student t distribution by W.T. Gosset, a chemist for Guinness Brewery

□ Wrote under a pseudonym for the same reason romance writers do ... statistics not an honorable profession
Gosset's 1908 Article

THE PROBABLE ERROR OF A MEAN.

By Student.

Introduction.

Any experiment may be regarded as forming an individual of a “population” of experiments which might be performed under the same conditions. A series of experiments is a sample drawn from this population.

Now any series of experiments is only of value in so far as it enables us to form a judgment as to the statistical constants of the population to which the experiments belong. In a great number of cases the question finally turns on the value of a mean, either directly, or as the mean difference between the two quantities.

If the number of experiments be very large, we may have precise information as to the value of the mean, but if our sample be small, we have two sources of uncertainty: (1) owing to the “error of random sampling” the mean of our series of experiments deviates more or less widely from the mean of the population, and (2) the sample is not sufficiently large to determine what is the law of distribution of individuals. It is usual, however, to assume a normal distribution, because, in a very large number of cases, this gives an approximation so close that a small sample will give no real information as to the manner in which the population deviates from normality: since some law of distribution must be assumed it is better to work with a curve whose area and ordinates are tabled, and whose ...


Translation  -  we  use  the  normal  because  we  know  it... 
"When  all  you  have  is  a  hammer,  everything  begins  to 
look  like  a  nail." 
Conclusion

Section VI. Practical Test of the foregoing Equations.

Before I had succeeded in solving my problem analytically, I had endeavoured to do so empirically. The material used was a correlation table containing the height and left middle finger measurements of 3000 criminals, from a paper by W. R. Macdonell (Biometrika, Vol. I. p. 219). The measurements were written out on 3000 pieces of cardboard, which were then very thoroughly shuffled and drawn at random. As each card was drawn its numbers were written down in a book which thus contains the measurements of 3000 criminals in a random order. Finally each consecutive set of 4 was taken as a sample (750 in all) and the mean, standard deviation, and correlation of each sample determined.
□ Simulating a shuffle test - without replacement

□ Could have improved his procedure with bootstrap, I think, and by repeating the sets of 750 three or four times.
Fisher Shuffled for his Randomization (Exact) Test

In this clever experiment, Fisher justified the t and F statistics to examine the null hypothesis of sameness, without appealing to the Central Limit Theorem or the Chi Square Theorem. It makes us more comfortable with violations of the normality and equality of variance assumptions.

[Table: two example orderings (Ordering 1, Ordering 2) of the 11 observations under the Wet and Dry labels]

n_W = 5, n_D = 6
ȳ_W = 21, ȳ_D = 23, Δȳ = 2
Number of orderings: 11!/(5! 6!) = 462
H0: μ_W = μ_D

Fisher's Randomization Test Demonstration

These are two of 462 ways to assign 5 samples to the “Wet” label. Under H0 (W=D), the W/D labels make no difference in y values and thus the samples randomly appear in each column. We plot the distribution of mean differences, and observe how often a difference in means of 2 appears. This is the randomization test.

In this problem, the alphas for both the t and the randomization test agree: α_t ≈ α_RAND = .34

[Figure: Distribution of 462 t Statistics from Randomization Test; x-axis from -3.5 to 3.5]

Both t and F can be shown to fit this distribution ...
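
The full randomization distribution is small enough to enumerate directly. A minimal sketch: the slide's data table did not survive intact, so the values below are partly illustrative fill-ins, chosen only so that the group sizes (5 and 6) and group means (21 and 23) match the slide.

```python
from itertools import combinations
from statistics import mean

# Partly hypothetical data: sizes and means match the slide, individual values may not.
wet = [30, 11, 25, 14, 25]      # mean 21
dry = [26, 24, 28, 21, 25, 14]  # mean 23
pooled = wet + dry
observed = mean(dry) - mean(wet)  # difference in means, Dry minus Wet

# Enumerate every way to give the "Wet" label to 5 of the 11 observations.
diffs = []
for wet_idx in combinations(range(len(pooled)), len(wet)):
    chosen = set(wet_idx)
    w = [pooled[i] for i in chosen]
    d = [pooled[i] for i in range(len(pooled)) if i not in chosen]
    diffs.append(mean(d) - mean(w))

n_orderings = len(diffs)
# One-sided share of orderings with a difference at least as large as observed.
p_one_sided = sum(x >= observed - 1e-9 for x in diffs) / n_orderings
print(n_orderings, round(p_one_sided, 3))
```

Because every relabeling is listed, the resulting proportion is exact for these data rather than a Monte Carlo estimate.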
Concept of an External Reference Distribution and Alpha Error

□ Suppose I am a data collector - only the best! I have historical records of B-1B dry data sets

□ Someone offers me a data set claiming it's B-1B dry data, x̄ = 25m, n = 10

□ Examining my reference collection, I observe that only 6.75% of my 487 samples equal or exceed this value

□ Accordingly, I reject this set with a 6.75% chance of being mistaken

□ I find the likelihood that I see a mean of 25m given this is B-1B dry is low
Developing a Random Variable Distribution

Step 1. Collect some data

Step 2. Assess a discrete or continuous curve that describes how samples are observed

□  A  random  variable  converts  experimental 
outcomes  into  a  numeric  variable  that  can 
be  discussed  with  the  tools  we  are  learning. 

□  What  proportion  of  RWR  age  outs  are 
greater  than  12  seconds? 

□  With  a  reference  distribution  you  can 
compute  "unlikelihood" 

□  Probability  distribution:  Features  P(x) 
values  for  each  outcome.  P(x)  are  between 
zero  and  one.  The  p's  sum  to  1.0 

□  We  have  several  classical  theoretical  ones: 
binomial,  t,  z,  chi-square,  F  ... 


June  2007  Resampling  Statistics  Tutorial  R-17 


Concepts of Resampling

[Figure: re-sampling from a sample of age out times]

□ As with classical statistical theory - what we know of the population is contained in the sample

□ With classical stats, we use mathematical theorems to estimate the behavior of the population

□ With resampling - we draw repeated samples from our original sample and use that evidence to describe the population

■ Conceptually easier to swallow (and teach)

■ Fewer distributional assumptions

■ More flexible in practice

□ Let's see where this idea takes us...
As we re-sample, we obtain different values for a Random Variable

□ Original sample of 13 weapon scores

□ Resample in groups of 13 (with replacement)

20 Degree Dive Bomb Bomb Scores: Sampling Distributions of Random Variable

Sample    X     Y     ReSample   MeanY   MedY   SDY    X(90)
1         -4    1     1          33.2    36.0   17.5   15.8
2         -3    36    2          15.2    18.9   20.5   36.0
3         -9    36    3          33.2    33.2   0.7    33.2
4         3     14    4          15.2    33.2   32.7   23.3
5         -3    19    5          15.8    32.7   15.2   17.5
6         10    20    6          17.5    15.8   33.2   0.7
7         -3    15    7          17.5    18.9   32.7   35.9
8         8     23    8          32.7    23.3   15.2   15.2
9         0     33    9          0.7     36.0   15.8   15.2
10        -14   33    10         15.2    33.2   13.7   13.7
11        -3    16
12        5     18
13        -6    8
                      Resample   19.6    32.9   10.4   36.0
                      Original   22.6    18.9   11.0   35.4

[Figure: 20 Deg DB Scores scatter plot; axis label: Cross Track]
Distribution of the Mean as we re-sample (sampling distribution)

100 means of samples of n=10, resampled from the 13 original Y scores. The distribution of these means gives us a gauge to judge claims about the true population mean. Is it likely the true population mean is 8.0? What about ...

Original Sample Mean: 22.6
Resampled Means: Mean 20.5, 5th 15.0, 95th 25.7

Bin     Freq
12      1
14      1
16      4
17      10
19      26
21      18
23      15
25      12
27      9
28      2
More    2

Sorted resampled means (start of column): 12.0, 13.8, 13.9, 14.8, 15.0, 15.3, 15.6, 15.8, 15.9, 16.5, 16.7, 16.8, 16.8, 16.9, 17.3, 17.4, 17.6, 17.6, 17.8, 17.8, 18.1, 18.2, 18.3, 18.4, 18.5, ...

[Figure: Sampling Distribution of 100 Means, Y bomb scores (histogram by bin)]


□ Resampling statistics (from Simon '68 and Efron '79) supply us with that empirical reference distribution we had been hoping for

□  We  can  make  probability  statements 
regarding  the  behavior  of  any  function 
computed  from  the  sample 
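
The resampled-means exercise can be sketched in a few lines. This is an illustration only: it uses the 13 Y scores as they appear (rounded) in the earlier bomb-score table, so its numbers will not reproduce the slide's figures, and the function name `bootstrap_means` is made up for this example.

```python
import random
from statistics import mean

# Y scores as transcribed (rounded) from the bomb-score table.
y_scores = [1, 36, 36, 14, 19, 20, 15, 23, 33, 33, 16, 18, 8]

def bootstrap_means(data, n, r, seed=0):
    """Draw r resamples of size n with replacement; return the sorted means."""
    rng = random.Random(seed)
    return sorted(mean(rng.choices(data, k=n)) for _ in range(r))

r = 100
means = bootstrap_means(y_scores, n=10, r=r)
low = means[r * 5 // 100]        # roughly the 5th percentile
high = means[r * 95 // 100 - 1]  # roughly the 95th percentile
share_le_8 = sum(m <= 8.0 for m in means) / r  # how often a mean of 8.0 or less appears
print(round(low, 1), round(high, 1), share_le_8)
```

A claimed population mean that falls far outside the band between `low` and `high` is one the sample gives little support to.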
Central Limit Theorem - Recap

Given:

1. Random variable x distributed arbitrarily with mean μ and standard deviation σ.

2. Samples of size n are drawn randomly (independently!) from this population.

Conclusions:

1. The distribution of sample means x̄ will, as the sample size increases, approach normal.

2. The mean of the sample means will be the population mean μ.

3. The standard deviation of the sample means will be σ/√n (the standard error of the mean).

Practical Rules of Thumb

1. For samples of size n larger than 30, the distribution of the sample means can be approximated reasonably well by a normal distribution. The approximation gets better as the sample size n becomes larger.

2. If the original population is itself normally distributed, then the sample means will be normally distributed for any sample size n.
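
These conclusions can be checked by simulation. A minimal sketch, assuming an exponential population with rate 1 (so μ = 1 and σ = 1); the choice of population and of n = 30 is illustrative, not from the slides.

```python
import random
from math import sqrt
from statistics import mean, pstdev

rng = random.Random(42)
mu, sigma = 1.0, 1.0  # exponential(rate=1): mean 1, standard deviation 1
n = 30                # size of each sample
reps = 5000           # number of samples drawn

# Draw many samples from a clearly non-normal population and keep each mean.
sample_means = [mean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(reps)]

grand_mean = mean(sample_means)     # should be close to mu
observed_se = pstdev(sample_means)  # should be close to sigma / sqrt(n)
theoretical_se = sigma / sqrt(n)
print(round(grand_mean, 3), round(observed_se, 3), round(theoretical_se, 3))
```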
Distribution of other Statistics from the sample

Original median: 18.9
Resample: Median 19.0, 5th 14.48671, 95th 28.22049

[Column of sorted resampled medians, running from 15.5 to 18.9 in the portion shown]

[Figure: Sampling Distribution of the Median (histogram with frequency and cumulative % by bin; bins from 13.7 to 32.3)]


□  There  are  combinatoric  ways  to  solve  for 
the  sampling  distribution  of  the  median, 
but  not  with  normal-theory  methods. 
Sampling distributions enable us to judge Significance

[Figure: Sampling Distribution of 100 Means, Y bomb scores (histogram by bin)]

□ What is the likelihood that the true population average is 8? 19? 35?
□ Reliability can be viewed as binomial - in any mission, it fails or does not


□  JASSM  reliability  was  one  of  the 
motivating  problems 

□  We  worked  a  sample  size  (power) 
problem  with  a  20  missile  launch 
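
A 20-launch sample size question of this kind can be sketched by simulating launches under the "old" and "fixed" worlds. The slides do not give the old reliability; the 60% baseline and the 10% false alarm limit below are hypothetical, chosen only to illustrate the mechanics.

```python
import random

def simulate(p, shots=20, trials=20_000, seed=0):
    """Return the number of successes in each of `trials` simulated test programs."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(shots)) for _ in range(trials)]

old = simulate(0.60, seed=1)  # hypothetical "low" reliability (assumption)
new = simulate(0.80, seed=2)  # the "fixed" 80% reliability from the slide

# Find the smallest success count c such that a 60% missile reaches c
# no more than 10% of the time (alpha), then see how often an 80% missile does (power).
for c in range(22):
    alpha = sum(x >= c for x in old) / len(old)
    if alpha <= 0.10:
        break
power = sum(x >= c for x in new) / len(new)
print(c, round(alpha, 3), round(power, 3))
```

If the resulting power is too low for the decision at hand, 20 shots cannot reliably separate the two worlds and the sample size or the baseline assumption has to change.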
One sample of means, medians, percentiles

□ We used the JPF as a motivating example

□ Does the test evidence support the claimed reliability?
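
One way to put the JPF question: if the fuze really were 90% reliable, how often would 6 trials produce 2 or more (or 3 or more) failures? A minimal sketch, taking the claim as exactly 90% and treating trials as independent (both assumptions); the helper name `p_at_least` is made up for this example.

```python
import random
from math import comb

def p_at_least(k, n, p_fail):
    """Exact binomial probability of k or more failures in n trials."""
    return sum(comb(n, j) * p_fail**j * (1 - p_fail) ** (n - j) for j in range(k, n + 1))

exact_2 = p_at_least(2, 6, 0.10)  # 2 or more failures of 6
exact_3 = p_at_least(3, 6, 0.10)  # 3 or more failures of 6

# The same question answered by simulation instead of a formula.
rng = random.Random(7)
trials = 50_000
fails = [sum(rng.random() < 0.10 for _ in range(6)) for _ in range(trials)]
sim_2 = sum(f >= 2 for f in fails) / trials
sim_3 = sum(f >= 3 for f in fails) / trials
print(round(exact_2, 4), round(sim_2, 4), round(exact_3, 4), round(sim_3, 4))
```

The simulated and exact values should agree closely, which is the teaching point: the simulation needs no binomial formula.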
□ We used the "SWEAT Diet" as one of our motivating examples

□ Question 1: do the diets differ in weight lost?

□ Question 2: if so, by how much?
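
Both questions map onto the two resampling schemes: a shuffle for "do they differ" and a bootstrap for "by how much". A minimal sketch with hypothetical weight-loss values (the tutorial's data are not in this text):

```python
import random
from statistics import mean

# Hypothetical pounds lost per person; not data from the tutorial.
sweat = [4.2, 6.1, 3.8, 7.0, 5.5, 2.9, 6.4, 5.1]
control = [2.1, 3.9, 1.5, 4.4, 2.8, 3.2, 0.9, 3.6]
observed = mean(sweat) - mean(control)

rng = random.Random(3)
r = 2000

# Question 1: shuffle the diet labels (sampling without replacement).
pooled = sweat + control
extreme = 0
for _ in range(r):
    shuffled = pooled[:]
    rng.shuffle(shuffled)
    diff = mean(shuffled[:len(sweat)]) - mean(shuffled[len(sweat):])
    if abs(diff) >= abs(observed):
        extreme += 1
p_value = extreme / r

# Question 2: bootstrap each group (with replacement) for a 90% interval on the difference.
boot = sorted(
    mean(rng.choices(sweat, k=len(sweat))) - mean(rng.choices(control, k=len(control)))
    for _ in range(r)
)
ci_low, ci_high = boot[r * 5 // 100], boot[r * 95 // 100 - 1]
print(round(observed, 2), p_value, round(ci_low, 2), round(ci_high, 2))
```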
Resampling Poses Three Sample Size Problems

□ How many samples to draw from original data (d)

■ Best guidance - O(n) the original sample but NLT 10-15

□ How many times to resample original data (r)

■ Best guidance -- >500 times. 1000 converges most problems thus far

□ How many original sample data values (n)

■ Next slide

[Figure: Percentiles of Resample Distribution vs r; y-axis from 0.0 to 25.0; x-axis: Number of Resamples (r) = 100, 250, 500, 750, 1000, 5000, 10000, 50000]
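
The guidance on r can be checked by watching the percentiles settle as r grows. A minimal sketch, reusing the 13 Y scores as transcribed (rounded) from the bomb-score table; the function name `percentile_band` is made up for this example.

```python
import random
from statistics import mean

data = [1, 36, 36, 14, 19, 20, 15, 23, 33, 33, 16, 18, 8]  # Y scores, as transcribed

def percentile_band(data, r, seed=0):
    """Approximate 5th and 95th percentiles of r bootstrap means."""
    rng = random.Random(seed)
    means = sorted(mean(rng.choices(data, k=len(data))) for _ in range(r))
    return means[r * 5 // 100], means[r * 95 // 100 - 1]

bands = {r: percentile_band(data, r) for r in (100, 250, 500, 1000, 5000)}
for r, (low, high) in bands.items():
    print(r, round(low, 1), round(high, 1))
```

Changing the seed and rerunning shows how much the band moves at small r compared with large r.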
t Test Sample Size α and β

OC curves for different values of n for the two-sided t test for a level of significance α = 0.05

1 Sample: RWR AO < 12 sec; α = β = .05; σ = 2 sec; Δ = 0.5 → n = 60

2 Sample (Ind): RWR v23 v40; range diff = 3; β = 0.10; σ = 2 miles; Δ = .75 → n* = 20

n1 = n2 = (n* + 1)/2 = 21/2 = 10.5, so n1, n2 = 11
How many original samples (n)

□ As in other areas, resampling clarifies the problem and illuminates the solution

□ We more clearly understand the sample size big four:

■ α, β, δ, σ

□ We can replicate the t and F test operating characteristic curves experimentally

[Figure: Experimental t Test OC Curve; one curve each for n=5, 10, 15, 20, 25, 30; x-axis: Delta]
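
An experimental OC curve can be built without t tables by simulating the null world to get a critical value, then simulating the alternate world to count how often the test fails to reject (β). A minimal sketch, assuming normal data with σ = 1 so that delta is expressed in units of σ; `oc_point` and its parameters are illustrative names.

```python
import random
from math import sqrt
from statistics import mean, stdev

def t_stat(sample, mu0=0.0):
    """One-sample t statistic against the hypothesized mean mu0."""
    return (mean(sample) - mu0) / (stdev(sample) / sqrt(len(sample)))

def oc_point(n, delta, alpha_pct=5, sims=1000, seed=0):
    """Estimate beta for a two-sided one-sample t test of size n at shift delta."""
    rng = random.Random(seed)
    # Null world: true mean 0. The critical value is a percentile of |t|.
    null_t = sorted(abs(t_stat([rng.gauss(0, 1) for _ in range(n)])) for _ in range(sims))
    crit = null_t[sims * (100 - alpha_pct) // 100 - 1]
    # Alternate world: true mean delta. Count failures to reject.
    accepted = sum(
        abs(t_stat([rng.gauss(delta, 1) for _ in range(n)])) <= crit
        for _ in range(sims)
    )
    return accepted / sims

curve = {(n, d): oc_point(n, d) for n in (5, 10, 20) for d in (0.0, 0.5, 1.0)}
for (n, d), beta in sorted(curve.items()):
    print(n, d, round(beta, 2))
```

At delta = 0 the estimate should sit near 1 − α, and β should fall as either delta or n grows, which is the shape of the classical OC curves.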
Classical Approaches to Multivariable Testing are Difficult to Master


□  Consider  the  JSOW  delivery: 

■  Three  terminal  delivery  parameters 

■  Several  threat  conditions 

■  Two  profiles 

■  Two  target  types  (fixed,  mobile) 

■  Two  sets  of  launch  conditions  (in 
range-in  zone) 

■  Number  of  combinations? 

□  Our  problems  are  multi-variable 

□  So  must  be  our  solutions 

■  ANOVA 

■  Regression 

■  Resampling 


Compare to simply checking ANOVA assumptions

[Figure: X Variable vs. Residuals in Statistica 6; x-axis categories by Mode (Auto, CDIP) and Altitude (High, Low)]

1. samples independent of t, x, y...
2. equal variances in all cells
3. residuals distributed normally

□ On Review/Save Residuals tab, choose and save all factors you want to plot against residuals

□ On resulting scroll sheet, do scatter plots with residuals as Y and X or time, or other variable as X
We're teaching our testers to create multivariable tests

Profile Oval

Total profile length: 8 minutes
2 engagements, 2 notches, 1 df arc
altitude and airspeed nominal
7 per hour

[Table: one-hour run sheet with columns Engagement, Direction (inbound, outbound), ECM (dry, CC, new 1, new 2), and Maneuver (none, 3-1)]


□  Factorial  experiment  in  Mobcap 
Apex  threat  exploitation  examines 

■  4  ECM  techniques, 

■  two  maneuvers,  and 

■  two  directions 

□  All  in  a  single  one  hour  mission 
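
The 4 x 2 x 2 factorial behind such a run sheet can be enumerated mechanically. A minimal sketch: the level names are read from the run sheet residue, and the fully randomized run order is an assumption (the original sheet may have grouped runs by ECM technique for practical reasons).

```python
import random
from itertools import product

# Factor levels as they appear on the run sheet.
ecm = ["dry", "CC", "new 1", "new 2"]
maneuver = ["none", "3-1"]
direction = ["inbound", "outbound"]

# Every combination of the three factors: 4 x 2 x 2 = 16 engagements.
runs = list(product(ecm, maneuver, direction))

# Randomize the run order (an assumption, see above).
rng = random.Random(2007)
rng.shuffle(runs)

for i, (e, m, d) in enumerate(runs, start=1):
    print(f"{i:2d}  {d:9s} {e:6s} {m}")
```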
ANOVA partitions variability around the group means

[Figure: Scatterplot (VGP02x3x4.STA 12v*48c) marking the group means and the grand mean; legend C_RATE: low, Med, High, para]


□  ANOVA  decomposes  total  variation  into 
explained  and  unexplained  portions 
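
The decomposition can be verified numerically on any grouped data set. A minimal sketch with three hypothetical groups (the values are invented for illustration):

```python
from statistics import mean

# Hypothetical responses at three factor levels.
groups = {
    "low": [12.0, 15.0, 11.0, 14.0],
    "med": [18.0, 21.0, 17.0, 20.0],
    "high": [25.0, 27.0, 24.0, 28.0],
}
all_y = [y for ys in groups.values() for y in ys]
grand = mean(all_y)

# Total variation around the grand mean.
sst = sum((y - grand) ** 2 for y in all_y)
# Explained: group means around the grand mean.
ss_between = sum(len(ys) * (mean(ys) - grand) ** 2 for ys in groups.values())
# Unexplained: observations around their own group mean.
ss_within = sum((y - mean(ys)) ** 2 for ys in groups.values() for y in ys)

k, n_total = len(groups), len(all_y)
f_stat = (ss_between / (k - 1)) / (ss_within / (n_total - k))
print(round(sst, 2), round(ss_between, 2), round(ss_within, 2), round(f_stat, 2))
```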
Regression Partitions Variance around line (not grand mean as ANOVA)

Assumptions

$\varepsilon_i \sim N(0, \sigma^2)$
$\varepsilon_i$ independent

Regression Coefficients

$b_0 = \bar{y} - b_1 \bar{x}$

$b_1 = \Delta y / \Delta x$

Recall $S^2 = \dfrac{\sum (y_i - \bar{y})^2}{n - 1}$

Summing the Squares of the Components of Variance

$(y_i - \bar{y}) = (y_i - \hat{y}_i) + (\hat{y}_i - \bar{y})$

$\sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 + \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2$

SST = SSE + SSR
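
The same partition can be checked for a fitted line, and the slope can then be resampled instead of relying on the normal-error assumption. A minimal sketch with hypothetical (x, y) pairs; resampling whole pairs is one common bootstrap scheme for regression and is an assumption here, not necessarily the tutorial's procedure.

```python
import random
from statistics import mean

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]  # hypothetical responses

def fit(xs, ys):
    """Least squares intercept b0 and slope b1."""
    xb, yb = mean(xs), mean(ys)
    sxx = sum((xi - xb) ** 2 for xi in xs)
    b1 = sum((xi - xb) * (yi - yb) for xi, yi in zip(xs, ys)) / sxx
    return yb - b1 * xb, b1

b0, b1 = fit(x, y)
y_hat = [b0 + b1 * xi for xi in x]
y_bar = mean(y)
sst = sum((yi - y_bar) ** 2 for yi in y)
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
ssr = sum((fi - y_bar) ** 2 for fi in y_hat)

# Bootstrap the slope by resampling (x, y) pairs with replacement.
rng = random.Random(11)
slopes = []
for _ in range(1000):
    idx = rng.choices(range(len(x)), k=len(x))
    xs, ys = [x[i] for i in idx], [y[i] for i in idx]
    if len(set(xs)) > 1:  # skip resamples where every x is identical
        slopes.append(fit(xs, ys)[1])
slopes.sort()
m = len(slopes)
slope_low, slope_high = slopes[m * 5 // 100], slopes[m * 95 // 100 - 1]
print(round(b1, 3), round(slope_low, 3), round(slope_high, 3))
```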


Applying Resampling to DOE


□  Sample  case  had  a  two  variable 
design  with  an  overall  mean  of  40 
and  an  A  effect  of  60  in  the 
presence  of  15  units  noise 

□  Resampling  Model  estimated  as 
follows  (90%  confidence  interval) 


□ This case shows that resampling gives similar answers to classical methods. Only A is active in changing Y; B and AB are inactive (effect CIs include 0)
Resampling vs. Classical ANOVA

□ We get substantially identical results with resampling and ANOVA

□ The algorithm is simple and resistant to "misuse"

Resampling Results (Within Rows)

        L        U       Estimate   Real    CI Width
A       56.2     79.1    67.5       60.0    22.9
B       -13.7    8.3     -2.2       0.0
AB      -7.4     15.6    4.8        0.0
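
A within-rows bootstrap for a two-factor design can be sketched as follows. The simulated case follows the slide's description (overall mean 40, A effect 60, noise of 15 units); four replicates per row, normal noise, and the percentile method for the 90% interval are assumptions made for this illustration.

```python
import random
from statistics import mean

rng = random.Random(5)
reps = 4  # replicates per design row (assumption)
design = [(-1, -1), (1, -1), (-1, 1), (1, 1)]  # coded (A, B) settings

# Simulated responses: mean 40, A effect 60 (so +/-30 per coded unit), noise sd 15.
rows = {
    cell: [40 + 30 * cell[0] + rng.gauss(0, 15) for _ in range(reps)]
    for cell in design
}

def effects(rows):
    """Return the A, B and AB effects (high average minus low average)."""
    m = {cell: mean(ys) for cell, ys in rows.items()}
    a_eff = sum(a * m[(a, b)] for a, b in m) / 2
    b_eff = sum(b * m[(a, b)] for a, b in m) / 2
    ab_eff = sum(a * b * m[(a, b)] for a, b in m) / 2
    return a_eff, b_eff, ab_eff

# Resample with replacement inside each row so each row keeps its own mean.
r = 1000
boot = [
    effects({cell: rng.choices(ys, k=len(ys)) for cell, ys in rows.items()})
    for _ in range(r)
]
ci = {}
for i, name in enumerate(("A", "B", "AB")):
    vals = sorted(e[i] for e in boot)
    ci[name] = (vals[r * 5 // 100], vals[r * 95 // 100 - 1])
    print(name, round(ci[name][0], 1), round(ci[name][1], 1))
```

An effect whose interval excludes 0 is read as active, mirroring the L and U columns in the table above.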
Warning - the Akerson (1) Smear Effect

□ Suppose I have two strong effects - A at 60 and AB at 20

□ Estimate these effects via shuffling

                          Y
Case      A     B     AB      1       2       3       4       YBar
(1)       -1    -1    1       24.1    19.6    29.7    49.4    30.7
b         -1    1     -1      -24.3   -29.3   -1.6    -8.0    -15.8
a         1     -1    -1      32.4    57.8    34.4    80.0    51.2
ab        1     1     1       85.6    61.0    69.2    70.5    71.6
Effects   53.9  -13.0 33.5                                    34.4

• When looking up the values in the resampling distribution

  • A of 53.9 is at the 99.4 percentile - a strong effect

  • AB=33.5 is at the 87.5 percentile - an insignificant value

• What happened? The strong A effect is smeared over the rest of the data set, masking the AB effect. Permutation fails here.

• When resampling within rows (to keep A where it belongs)

  • AB=33.5 is at the 99.9th percentile where it belongs

1. Discovered by LtCol Jerome
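
The smear can be explored with the replicate values from the table. A minimal sketch: the slide does not spell out its within-rows scheme, so the restricted shuffle below (shuffling responses only among runs that share the same A level) is one assumed reading of "keeping A where it belongs". The percentiles it prints are not claimed to match the slide's.

```python
import random
from statistics import mean

# Replicates transcribed from the table, keyed by coded (A, B) setting.
cells = {
    (-1, -1): [24.1, 19.6, 29.7, 49.4],   # (1)
    (-1, 1): [-24.3, -29.3, -1.6, -8.0],  # b
    (1, -1): [32.4, 57.8, 34.4, 80.0],    # a
    (1, 1): [85.6, 61.0, 69.2, 70.5],     # ab
}
runs = [(a, b, y) for (a, b), ys in cells.items() for y in ys]

def ab_effect(runs):
    """AB interaction effect: mean where A*B = +1 minus mean where A*B = -1."""
    plus = [y for a, b, y in runs if a * b == 1]
    minus = [y for a, b, y in runs if a * b == -1]
    return mean(plus) - mean(minus)

observed = ab_effect(runs)

def percentile_of_observed(shuffle_within_a, r=5000, seed=0):
    """Percent of shuffled AB effects that fall below the observed AB effect."""
    rng = random.Random(seed)
    below = 0
    for _ in range(r):
        if shuffle_within_a:
            # Restricted shuffle: responses move only among runs at the same A level.
            new = []
            for level in (-1, 1):
                block = [run for run in runs if run[0] == level]
                ys = [y for _, _, y in block]
                rng.shuffle(ys)
                new += [(a, b, y) for (a, b, _), y in zip(block, ys)]
        else:
            # Unrestricted shuffle: the A effect is spread over the whole data set.
            ys = [y for _, _, y in runs]
            rng.shuffle(ys)
            new = [(a, b, y) for (a, b, _), y in zip(runs, ys)]
        if ab_effect(new) < observed:
            below += 1
    return 100 * below / r

pct_full = percentile_of_observed(shuffle_within_a=False)
pct_within_a = percentile_of_observed(shuffle_within_a=True)
print(round(observed, 1), pct_full, pct_within_a)
```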
Resampling From Non-Normal Data

□ 30 years experience teaches:

"When resampling and classical methods are applied to classical problems they agree. When applied to non-classical problems, resampling usually gets the right answer."

□ Procedure:

■ Create and examine data that violates normality & equal variance assumptions for ANOVA

□ Findings:

■ For bootstrap and transformed ANOVA, confidence intervals will not at all agree

■ Bootstrap can estimate other useful distribution statistics, including median and percentile
Some cautionary notes on resampling non-normal data

1. There is a bias in bootstrap estimates of variance in confidence intervals -- therefore in significance tests.

2. Bias can be fixed using a second bootstrap procedure.

3. Normal assumptions turn out to be pretty good until the data get way off. Strongly bivariate data are the case we’ll show.

4. So far, large skewness doesn't appear to be a major factor.

5. Regardless of this bias, there are numerous important roles for bootstrap.

Warning for beginning users -- dangerous curves ahead.
Mechanics of analysis via Resampling are simple

□ Algorithm:

■  Set  up  a  (possibly  constructive) 
sample 

■  Compute  a  function  of  the  sample 
observations  (Random  Variable) 

■  Choose  a  resampling  method 

□  Permutation  (shuffle) 

□  With  replacement  (Bootstrap) 

□  With  single  item  deletion 
(Jackknife) 

■  Resample  the  original  sample 

■  Review  the  output  for  the  distribution 
of  the  Random  Variable 

■  Draw  conclusions 
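
The steps above can be sketched as one small function. The name `resample`, its parameters, and the example statistics are illustrative choices; the sample is the set of 13 Y scores as transcribed (rounded) from the bomb-score table.

```python
import random
from statistics import mean, median

def resample(sample, statistic, method="bootstrap", r=1000, seed=0):
    """Apply `statistic` to resamples of `sample` and return the list of values."""
    rng = random.Random(seed)
    n = len(sample)
    if method == "jackknife":
        # Single item deletion: one resample per omitted observation.
        return [statistic(sample[:i] + sample[i + 1:]) for i in range(n)]
    out = []
    for _ in range(r):
        if method == "bootstrap":
            draw = rng.choices(sample, k=n)  # with replacement
        elif method == "permutation":
            draw = rng.sample(sample, k=n)   # shuffle, without replacement
        else:
            raise ValueError(f"unknown method: {method}")
        out.append(statistic(draw))
    return out

scores = [1, 36, 36, 14, 19, 20, 15, 23, 33, 33, 16, 18, 8]

# Bootstrap distribution of the median.
boot_medians = sorted(resample(scores, median, "bootstrap"))
# Jackknife values of the mean (one per deleted observation).
jack_means = resample(scores, mean, "jackknife")
# Permutation: treat the first 6 shuffled values as one group, the rest as another.
perm_diffs = resample(scores, lambda s: mean(s[:6]) - mean(s[6:]), "permutation")
print(boot_medians[50], boot_medians[949], len(jack_means), len(perm_diffs))
```

The statistic passed in is the "Random Variable" of the algorithm; swapping it for a percentile, a correlation, or an effect estimate leaves the rest of the procedure unchanged.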
Summary and Questions

Summary of Resampling

□ 53rd Wing power and confidence across broad battlespace as test method

□  Real  world  poses  challenges  to 
classical  assumptions 

□  Resampling  provides  a  method 
(potentially)  that  is: 

■  Simple  to  teach  and  use 

■  Resistant  to  misuse 

■  Robust  against  violations  of 
assumptions 

■  Helps  with  missing  data 

■  Dealing  directly  with  the 
problem 

■  Illustrative  of  difficult 
statistical  concepts 

□  Progress  report  after  5  years 
of  investigation  -  very 
promising  -  but  Akerson 
Smear  a  troublesome  issue 


“To call in the statistician after the experiment is ... asking him to perform a postmortem examination: he may be able to say what the experiment died of.”

R. A. Fisher, Address to Indian Statistical Congress, 1938.
□ Peter Bruce, Resampling Stats, Inc., pbruce@resample.com, www.resample.com, the single best resource to learn these methods

□ Resampling: the New Statistics, Simon and Bruce

□ Bootstrap Methods, Michael Chernick

□ Permutation Tests, Phillip Good

□ R: A statistics language, freeware for download - numerous sites

□ gregory.hutto@eglin.af.mil

Questions?